190 ◾ Bioinformatics
As shown in Figure 5.15, the library sizes (sequencing depths) of the six samples are
similar. The bar chart shows how the library sizes are distributed across the samples and
helps reveal any potential bias arising from unequal library sizes.
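A minimal sketch of how such a library-size bar chart could be produced in base R is given here. The count matrix and its values are made up for illustration and are not the six samples analyzed in this chapter. For an edgeR DGEList object y, the library sizes are stored in y$samples$lib.size, so the colSums() step would not be needed.

```r
# Made-up count matrix: 4 genes x 6 samples (illustrative values only).
counts <- matrix(
  c(120, 130, 110, 125, 118, 122,
    300, 310, 290, 305, 295, 315,
     15,  12,  18,  14,  16,  13,
    900, 880, 920, 910, 890, 905),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("S", 1:6))
)

# Library size = total number of counts assigned to each sample.
lib_sizes <- colSums(counts)

# Bar chart of library sizes; bars of similar height suggest comparable depth.
barplot(lib_sizes, names.arg = names(lib_sizes), las = 2,
        ylab = "Library size (total counts)")

# Ratio of the largest to the smallest library as a quick numeric check.
size_ratio <- max(lib_sizes) / min(lib_sizes)
size_ratio
```

A ratio close to 1 agrees with bars of similar height; a much larger ratio would indicate that sequencing depth differs substantially between samples.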
In the normalization step, we normalized the count data to eliminate composition biases
between libraries. We can assess the TMM normalization with an MD (mean-difference) plot,
which displays the library size-adjusted log-fold change between two libraries (the
difference) against the average log-expression across those libraries (the mean). If the
normalization removed the biases between libraries successfully, the points on the MD plot
should be centered on the line of zero log-fold change. The “plotMD(y, column=i)” function
creates an MD plot by converting the counts (y) to log2-CPM values and then creating an
artificial array by averaging all samples other than the sample specified (column=i) in the
FIGURE 5.16 Mean-difference plots.
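The calculation behind a mean-difference plot can be sketched in base R as follows. This is a simplified illustration, not the edgeR implementation: the helper md_values() and the toy counts are hypothetical, the log2-CPM values use a constant offset of 0.5 to avoid taking the log of zero, and no TMM normalization factors are applied.

```r
# Toy count matrix: 5 genes x 6 samples (made-up numbers, for illustration only).
counts <- matrix(
  c(  10,  12,    9,   11,  10,   13,
     200, 210,  190,  205, 198,  220,
      50,  45,   55,   52,  48,   51,
       0,   1,    0,    2,   1,    0,
    1000, 980, 1020, 1010, 990, 1005),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("gene", 1:5), paste0("S", 1:6))
)

# Hypothetical helper: mean and difference values for one sample
# compared with the average of all the other samples.
md_values <- function(counts, column, prior = 0.5) {
  lib_size <- colSums(counts)
  # Simplified log2-CPM: offset the counts, divide each sample by its library size.
  log_cpm <- log2(t(t(counts + prior) / lib_size) * 1e6)
  # "Artificial array": gene-wise average of every sample except the chosen one.
  others <- rowMeans(log_cpm[, -column, drop = FALSE])
  data.frame(
    mean = (log_cpm[, column] + others) / 2,  # x-axis: average log-expression
    diff = log_cpm[, column] - others         # y-axis: log-fold change
  )
}

md <- md_values(counts, column = 1)
plot(md$mean, md$diff, xlab = "Average log2-CPM", ylab = "Log-fold change")
abline(h = 0, col = "red", lty = 2)  # reference line at zero log-fold change
```

With an edgeR DGEList object y, the corresponding call would be plotMD(y, column=1), optionally followed by abline(h=0) to mark the zero line.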